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ABSTRACT 



The index n of a power law power spectrum of primordial density fluctuations, 
P{k) oc A:", has been estimated using many different techniques. The most precise 
compare the COBE^ DMR large angular scale AT to the amplitude of the large 
scale structure, but these are also the most model-dependent. The COBE DMR 
AT has also been compared to the degree-scale AT from several experiments. And 
finally, a relatively model-independent value of n can derived from the COBE data 
alone, but the small range of angular scales covered by COBE limits the precision 
of these methods. 



Introduction 



In this paper I compare several different methods for determining the spectral index n 
of the power spectrum of primordial density perturbations. All of the determinations that 
use COBE data are statistically compatible with the n ~ 1 predicted by the inflationary 
scenario. Because the largest scales appear as large angular scale features on the finite solid 
angle of sky that is available for viewing, the statistical uncertainties in the determination of 
n cannot be conquered by the usual expedient of getting more data. A careful consideration 
of the statistical methods used to analyze the large-angular scale COBE data is needed. 



Biased Statistics 



As an example of the pitfalls of statistical analysis, consider the maximum likelihood 
method applied to determining the standard deviation of a Gaussian from a set of N 
independent samples drawn from the distribution. The likelihood function is 

N 

L{fx, a) = {V2^a)-^ JJ exp[-0.5(xi - fif/o-^] (1) 

i=l 

which is maximized at /i = A^"'^ J^i^i and cr^ = A^^^ J^ii^t ~ f^Y- While the sample mean is 
an unbiased estimate of the population mean, the variance estimate is biased by a factor of 
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Fig. 1. — Likelihood contours vs and n, based on tlie 2 year 53 x 90 cross power spectrum 
for 3 < £ < 30. 



{N — 1)/N, illustrating the fact that maximum likelihood estimators are only asymptotically 
unbiased. Even with an unbiased estimate of a, the logarithm of an unbiased estimate 
of cr is not an unbiased estimate of the logarithm of a because noise rectification by the 
second derivative of the logarithm leads to a bias of — l/4iV. Since the value of when 
estimating the power in the fth multipole is N ki (2£+ l)f2obs/47r, and since Qobs is < Sn/S 
due to galactic contamination, these biases will be most signifcant for the low €s measured 
by COBE. Thus any method for determining n using COBE data should be tested using 
simulated data to calibrate these biases. 

One non-recommended technique for determining n is to treat the integrated likelihood 
f{n) = J L{Q,n)dQ as a probability density for n. The usual justification for this is the 
Bayesian rule that the probability density for Q and n after the experiment, Pa{Q,n), is 
given by 

Pa{Q, n) oc pp{Q, n)L{Q, n) (2) 
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Fig. 2. — Likelihood contours vs as and n, based on the 2 year 53 x 90 cross power spectrum 
for 3 < £ < 30. 

where Pp is the prior density. In this case it is true that 

Pa{n) = J Pa{Q, n)dQ oc j Pp{Q, n)L{Q, n)dQ. (3) 

Assuming a "uniform" prior to represent prior ignorance then gives the form in Equation 
^ But a uniform prior in is not the same as a uniform prior in InQ, and they give 
different values of n. Different ways of expressing our prior ignorance should not affect 
the answer. A more dramatic example is shown in Figure [T| and Figure each showing 
the likelihood contours for the Hauser- Peebles cross power spectrum of the 53 x 90 2 year 
maps. In one case they are plotted versus = T|, while in the other case they are plotted 
versus the relative mass fluctuations in 8/i~^ Mpc spheres, cig oc T-eso- The Jacobian of the 
transformation between {Q'^,n) and (Teso,^) depends on n, so a uniform prior in {Q'^,n) 
becomes an n-dependent prior in (Teso,^). Hence the / L{Q,n)dQ'^ peaks at a much lower 
n than the / L{Q[as, n], n)das- 



There is a better way, which is to use the maximum of L{Q,n) for a fixed n to 



generate the marginal likelihood over n. This approach to "uninteresting" parameters is 
recommended by Avni (1976). The maximum value does not require a Jacobian when 
transforming to different amplitude variables, so prior ignorance of gives the same 
answer as prior ignorance of Tgso • 

The process of determining an amplitude parameter (usually {Q\ms)^'^) ^"^^ 
spectral index n from the COBE maps is an extreme example of data reduction. In this 
process one takes the 360 x 10^ DMR data samples per year and produces maps with 
6 X 6144 values, and from these maps one calculates a smaller number of statistics. In the 
final step, {Q\msY'^ ^"^^ ^ estimated using the values of the statistics, leaving only 2 
values derived from nearly 10^ input values. This description is general enough to describe 
both the Gorski (1994) method using linear statistics and the methods involving quadratic 
statistics: the correlation function used by Bennett et al. (1994) and the Hauser- Peebles 
power spectrum used by Wright et al. (1994). The final result of any of these analysis 
methods is the values Qobs and Uobs determined from the real data, as well as an estimate 
(fi for the noise standard deviation in one observation. 



3. Monte Carlo Simulations 



In order to test these methods for biases, it is necessary to simulate both the cosmic 
variance, which gives a random map with random spherical harmonic amplitudes chosen 
from a Gaussian distribution with a variance determined from the chosen Qj„ and nj„, and 
the experimental variance., which gives the 360 million noise values needed per year. While 
programs to simulate the DMR time-ordered data do exist, none of the groups mentioned 
above have worked at this level of detail. Instead, they have used simulations that start 
with the maps. 

The effect of noise on the map production process can be simulated using 

T = aiA-^-^U (4) 

where o"i is the noise in one observation, U is an uncorrelated vector of unit variance zero 
mean Gaussian random variables, and A is the matrix with diagonal elements An equal 
to the number of times the i^^ pixel was observed, and off-diagonal elements —A^j equal 
to the number of times the i*'* pixel was referenced to the pixel. Even though A is 
singular, Wright et al. (1994) give a rapidly convergent scries technique for generating noise 
maps. Thus each noise map depends on 6144 independent Gaussian unit variance random 
variables and the parameter o"i. 

The signal map that is added to the noise maps to give the "observed" maps 
is generated using independent Gaussian random amplitudes. Bond & Efstathiou 
(1987) show that the expected variance of the coefficients in a spherical harmonic 
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Fig. 3. — Each point is an input parameter set that is consistent with the real data for a 
given reahzation of the random cosmic and radiometer variance processes. The hkehhood 
contours are at A(— 21nL) = 1, 4 and 9. 



expansion of the CMBR temperature given a power law power spectrum P{k) oc /c" is 
< > oc r[£+ {n - l)/2]/r[£+ (5 - n)/2] for £ < 40, with the constant of proportionality 
chosen so that 5 < > /Att = Q^. The simulations done by Wright et al. (1994) included 
£'s up to 39, so the signal map depends on 1600 Gaussian independent unit variance 
random variables and the two parameters Q and n, which I shall call Qin and riin below to 
distinguish them from the fitted values. 

The resulting Monte Carlo map depends on a set of random variables {Z} (1600 
+ 6144 elements for a one map analysis, or 1600 + 12288 for a cross- analysis needing 
two maps) with a known distribution, and the three parameters Qm, J^-m a-^d (Ti. (Ti can 
be determined with great precision using the time-ordered data. Hence one needs to run 
many Monte Carlo simulations with different values of Qin and and compare the fitted 
values Qout and nout to the fitted values for the real data, Qobs and nobs- For any given 



realization of {Z}, the fitted values Qout and Uout are a continuous function of the input 
parameters Qm and nj„, and one can choose values Qj„ = Q match and nj„ = rimatch such 
that Qout = Qobs and no^t = Uobs- By choosing many different realizations of {Z}, one 
creates many different Q match, nmatch pairs. The density of the points in the Q match, nmatch 
plane defines a probability density function for the true parameters Qtrue,ntrue that does not 
depend on any prior knowledge but does depend on the experimental result in a reasonable 
way. The random element in the process comes from {Z}, whose properties are known. 
Figure ^ shows this cloud of points for the 2 year 53 x 90 cross-power spectrum. The bias 
(An = 0.1) in the Gaussian approximation maximum likelihood method applied to the 
quadratic power spectrum statistics is only 0.25a, so the shift between the points and the 
contours is hard to see. 

The method using linear statistics (Gorksi 1994) has the advantage that the Gaussian 
expression for the likelihood is exact. A further advantage of this method is that any 
non-singular linear transformation of the basis functions will give the same answer, since 
the covariance matrix will change to cancel the change in the values of the statistics. While 
this means that the original motivation for generating a set of basis functions orthonormal 
in the cut sky is lost, one still has the advantage that basis functions orthogonal to any 
number of low order multipoles are easy to find. Gorski et al. (1994) find (for 3 < £ < 30) 
that the maximum of L{Q,n) occurs at n = 1.02 for the combined 2 year 53 GHz plus 
90 GHz map, and Monte Carlo simulations show that the bias in this application of the 
maximum likelihood method is small. 

Note that the 3 < £ < 30 cross power spectrum fits in Wright et al. (1994) still include 
the off-diagonal effect of the quadrupole on higher €s, while those in Gorski et al. (1994) 
are completely independent of the quadrupole. The modified Hauser-Peebles method in 
Wright et al. (1994) uses basis function defined using 

^ _ -foO < ^QoFim > -J-^ Fim' < Fim'Fim > ,r\ 

(j"£m — ^£m ^ — ^ 2^ ^ • [o) 

< -TOO-TOO > m,'=-l i'lm'i'lm' > 

where the Ffm are real spherical harmonics and the inner product < fg > is defined over the 
cut sphere. These functions Gim are orthogonal to monopole and dipole terms on the cut 
sphere. Call this the MD method since the basis functions are orthogonal to the monopole 
and dipole. Let the MDQ method use basis functions orthogonal to the monopole, dipole 
and quadrupole: 



. • (6) 

2m' > 



_ p Fqq < FooFfm > ^ Fim' < Fim'F^m > ^ -^2™' < F2m'Fl 

^ < FqqFqq > m' = -l Fim'Fim' > m'=-2 ^ -^2m'-^2m' 

Changing from the MD method to the MDQ method causes the mean power in T| for 
n = 1 Monte Carlo skies to go down by 31% while T| for the real sky goes up by 16%. 
This leads to a higher i = 4 point and a lower value of n (n = 1.02 instead of 1.22 for the 
53 X 90 cross-power spectrum). Figure ^ shows the hexadecapole power in /iK^ measured 
two different ways: on the x axis the MD method; and on the y axis T| measured using 
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Fig. 4. — Hexadecapole power determined using the MD method on tlie x-axis vs. the MDQ 
method on the y-axis for the real sky (open circle) and n = 1, {Qji]\js)^'^ = /^K simulations. 

the MDQ method. The real sky is shown as the open circle, while the dots are n = 1, 
(Q'rms)^'^ = 17 fiK Monte Carlo simulations. One sees that the real sky is moderately far 
out on the upper edge of the cloud of simulations, and this produces the 0.5a shift in n 
when changing basis functions. One also sees that the distribution of is quite skewed, 
which explains the bias in the method that maximizes the Gaussian approximation to the 
likelihood. 

Figure ^ shows the maximum likelihood values of n from fits to 3 < £ < 30 for 1800 
Monte Carlo runs with rij^ = 1 and Qin = 17 /iK. The x-axis shows n computed using 
the MD method, while the y-axis shows the results of the MDQ method. The real sky 
is shown as the open circle, and the mean of the 1800 Monte Carlo spectra is shown as 
the closed circle. This figure shows that the two methods are generally consistent, with 
the real sky moderately far out in the scatter. (The linear features for n = 1.00, 1.25 and 
1.50 are caused by the interpolation among input values spaced by Ariin = 0.25 during the 
maximum likelihood fits.) The overall performance of the MD method is better, with a bias 
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Fig. 5. — Spectral index n using the MD method on the ,T-axis vs. n from the MDQ method 
for 1800 Monte Carlo skies with riin = 1 and Qin = 17 /xK. The real sky is the open circle. 



that is 20% smaller and a standard deviation that is 8% smaller than those given by the 
MDQ method. Of course fits that include the quadrupole {2 < £ < 30) do even better, since 
they use more information. So the final result of the power spectral analysis is ambiguous: 
n ~ 1.4 if the quadrupole is included in the fit, n ^ 1.25 excluding the quadrupole using the 
MD method, or n ^ 1.0 when rigorously excluding the quadrupole using the MDQ method. 
The existence of all these options raises the specter of "optional stopping" , a time-honored 
method of introducing systematic errors into measurements. But fortunately this whole 
range of values is within the statistical uncertainty. 

Even in very simple cases this level of disagreement between different estimation 
techniques is common. For example, the RMS difference between the median and the mean 
of a set of Gaussian random numbers is 3/4 of the standard deviation of the mean. 



4. Degree-Scale 



The experiments at ~ 1° scale offer the possibility of a better determination of the 
primordial power spectrum index n, but the mo del- dependent effects of the wing of the 
Doppler peak at £ ~ 200 must be allowed for. Even in the large angle region i < 30 small 
model-dependent corrections must be made. A Cold Dark Matter (CDM) model with a 
primordial spectral index Upri = 0.96 has an apparent index Uapp = 1.1 due to the "toe" of 
the Doppler peak that extends into the i < 70 region. Wright et al. (1994) have made this 
comparison with degree scale experiments, and the resulting value for Upri is given in Table 

B 



5. Large Scale Structure 

A comparison of the extremely large scale structure seen by COBE to the large scale 
structure seen in studies of the clustering of galaxies also leads to an estimate of Upri- The 
uncertainty in this method is decreased because of the large range of scales covered, but 
also increased due uncertainties in the models of large scale structure formation. However, 
this comparison strongly favors n = 1. Prior to the COBE announcement of anisotropy. 
Peacock (1991) gave a implicit prediction that for n = 1 the amplitude of AT should be 
(QjiMs)^'^ = 18.8 /iK. Peacock & Dodds (1994) have extended this analysis of large scale 
structure and 1 get a result Upri = 0.99 ± 0.16 from their paper after correcting for their 
incorrect {Q^msY'^ ~ A*-^ ^"^^ increasing the uncertainty to allow for the uncertainty 
in the IRAS bias, hi. In Figure ^ 1 have "extended" Figure 6 from Peacock & Dodds to 
include the COBE datum, and show extrapolations with n = 0.5, 1, & 1.5 through the 
COBE point. This result assumes that Q = 1, but Peacock & Dodds have also found that 
^0.6/^^ ^ 1.0 ±0.2. 



6. Summary 

In conclusion, both the COBE AT data alone and the ratio of the COBE AT 
data to 1° scale AT are consistent with the n ^ 1 prediction of the inflationary scenario. 
Furthermore, the implied level of gravitational potential perturbations is sufficient to 
produce the observed large scale (100 Mpc) structure if both the n = 1 and = 1 
predictions of inflation are correct, and the Universe is dominated by Dark Matter. 
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Table 1: Spectral index determinations 
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Fig. 6. — An extended version of Figure 6 from Peacock & Dodds, showing n = 0.5, 1 and 
1.5 extrapolations through the COBE point. 



